Getting everyone to agree on gene signatures for murine macrophage polarization in vitro

Macrophages, key players in the innate immune system, showcase remarkable adaptability. Derived from monocytes, these phagocytic cells excel in engulfing and digesting pathogens and foreign substances as well as contributing to antigen presentation, initiating and regulating adaptive immunity. Macrophages are highly plastic, and the microenvironment can shaper their phenotype leading to numerous distinct polarized subsets, exemplified by the two ends of the spectrum: M1 (classical activation, inflammatory) and M2 (alternative activation, anti-inflammatory). RNA sequencing (RNA-Seq) has revolutionized molecular biology, offering a comprehensive view of transcriptomes. Unlike microarrays, RNA-Seq detects known and novel transcripts, alternative splicing, and rare transcripts, providing a deeper understanding of genome complexity. Despite the decreasing costs of RNA-Seq, data consolidation remains limited, hindering noise reduction and the identification of authentic signatures. Macrophages polarization is routinely ascertained by qPCR to evaluate those genes known to be characteristic of M1 or M2 skewing. Yet, the choice of these genes is literature- and experience-based, lacking therefore a systematic approach. This manuscript builds on the significant increase in deposited RNA-Seq datasets to determine an unbiased and robust murine M1 and M2 polarization profile. We now provide a consolidated list of global M1 differentially expressed genes (i.e. robustly modulated by IFN-γ, LPS, and LPS+ IFN-γ) as well as consolidated lists of genes modulated by each stimulus (IFN-γ, LPS, LPS+ IFN-γ, and IL-4).


Introduction
Macrophages are pivotal components of the innate immune system, known for their remarkable versatility in orchestrating immune responses [1].Derived from monocytes, these phagocytic cells exhibit an exceptional ability to engulf and digest pathogens, cellular debris, and foreign substances [2].Their functions extend beyond mere phagocytosis, as they play a crucial role in antigen presentation, contributing to the initiation and regulation of adaptive immunity [3].
Macrophages are highly plastic [4,5] and the microenvironment can shape their phenotype: the two ends of the spectra are classical activated macrophages (M1-polarizated, type I subset; inflammatory) and alternative activated macrophages (M2-polarizated, type II subset; antiinflammatory), both characterized by a specific and fine molecular and transcriptomic pattern [6].Classical activated macrophages can be polarized by different inflammatory stimuli such as Th1-related cytokines including interferon-γ (IFN-γ) and tumour necrosis factor-α (TNFα), alone or in combination with microbial products such as lipopolysaccharide (LPS) or lipoteichoic acid.This leads to the production of pro-inflammatory cytokines, oxidative species and proteases.Though essential in host defence, against pathogens and tumoral cells, M1 macrophages also contribute to several autoimmune diseases, determining tissue damage and organ loss of function [6,7].On the contrary, M2 macrophages can be polarized by IL-4 and IL-13, which are classically produced by Th2-polarized T cells [7,8].M2 macrophages participate, among other functions, in wound healing, tissue repair, as well as fibrosis, airway hypersensitivity, and also helminthic infections [9].While of importance for reductionist research, M1 and M2 are two of the many polarized subsets that may exist (according to the stimulus, the microenvironment, etc) [10,11], and, to this end, M1/M2 mixed phenotypes have also been described [12].
M1 and M2 macrophages exhibit distinct gene expression signatures, reflecting their diverse functions and roles.M1 macrophages, often associated with pro-inflammatory responses, are characterized by the upregulation of genes such as IL-1β, IL-6, TNF-α, and inducible nitric oxide synthase (iNOS).These molecules play essential roles in host defence against infections and the promotion of tissue damage during inflammatory conditions [9,13].In contrast, M2 macrophages, considered anti-inflammatory and tissue repair-promoting, display an expression profile marked by genes like Arginase-1 (Arg1), IL-10, and the mannose receptor (CD206).These genes are associated with immunomodulation, extracellular matrix remodelling, and wound healing [7,9,13,14].
The advent of RNA sequencing (RNA-Seq) represents a revolutionary breakthrough in molecular biology and genomics.This technology has ushered in a new era of high-throughput transcriptome analysis, providing researchers with an unprecedented level of insight into gene expression patterns and regulation [15].Unlike microarray technology, RNA-Seq offers a more comprehensive and quantitative view of the transcriptome.It enables the detection of both known and novel transcripts, alternative splicing events, and rare transcripts, facilitating a deeper understanding of the functional complexity of the genome [15].Moreover, RNA-Seq is highly sensitive and can quantify gene expression across a wide dynamic range [16].
The rapid decrease in costs associated with RNA-seq has significantly increased the number of datasets deposited in public databases and this is even more true for macrophages.Yet, to some surprise, there is no attempt to consolidate the information coming from the different murine or human datasets.This would have the advantage of reducing the intrinsic noise and allowing for authentic signatures, not driven by the scientific context and question.
M1 and M2 gene expression signatures are traditionally defined from historical findings.While this approach undoubtably is based on biologically-relevant genes it lacks a systematic approach to define the most robust possible signature.In 2015, Jablonski et al. applied a systematic approach to better define murine M1 and M2 polarization signatures [17] from a single microarray dataset.In the present manuscript, we took a step further and combined all the published murine RNA-Seq databases deposited so far to portray a robust polarization profile.

Datasets and exclusion criteria
The NIH Gene Expression Omnibus (GEO) database was thoroughly queried to pinpoint datasets encompassing whole-genome transcriptome information of macrophages.We have used the following string:
We excluded those studies whose data were not available on Sequence Read Archive (SRA).

RRA discovery study-Identification of robust differentially expressed genes by RNA-seq data RRA method
To integrate results from multiple datasets, we employed the robust rank aggregation (RRA) method, a well-established tool for analysing data from diverse arrays [18].Initially, we conducted an RRA discovery study using data extracted from all RNA-Seq datasets available.For each array dataset, we downloaded both the raw fastq files and the associated annotation document from the NCBI Sequence Read Archive (SRA) repository.
For each dataset, raw fastq files were processed through the nf-core/RNA-Seq pipeline version 3.6 [19] and mapped using star_rsem aligner against the mouse reference genome GRCm38.FeatureCounts module from the subread package was used for counting reads associated to genomic features.To compute differential expression analysis based on the negative binomial distribution independently for each dataset, we applied 'DESeq2' Bioconductor R package [20]; only genes with an adjusted p-value �0.05 and a fold change greater than |1.5| were selected for further analysis.
For each signature, the generated lists of differentially up-regulated and down-regulated genes in each dataset were then aggregated using the 'Robust Rank Aggregation' Bioconductor R package.Only genes with a p-value�0.05(RRA score) and whose fold-change was conserved in each independent dataset were retained in the final signature.Complete lists of signatures are listed in S1 and S2 Files.

Ingenuity pathway analysis and IPA comparison analysis
For each signature, functional enrichment of differentially expressed genes was performed using IPA (QIAGEN).Pathways belonging to the Canonical Pathway Analysis section were considered for the analysis (-log 10 p-value �1.3).

Unravelling the expression signatures of M1 and M2 macrophages
M1 and M2 polarizations are traditionally investigated by the up-regulation of genes which have been historically associated with these phenotypes (Fig 1).
In the present manuscript, as we wanted an unbiased approach, we first screened to find eligible RNA-Seq datasets for the most commonly used polarizing stimuli: IFNγ, LPS, IFNγ +LPS, IL-4 and IL-13.For IFNγ 5 datasets were eligible, for LPS 12 datasets were eligible, for IFNγ+LPS 13 were eligible, for IL-4 19 were eligible (Table 1).For IL-13 only 2 datasets were found and this was deemed insufficient to perform further meaningful analysis.While other stimuli are known to induce an M1 or M2 polarization [21][22][23][24][25], we chose those described above as they would have been more likely to yield enough datasets to meta-analyse.The analysis was built around rank-robust analysis (RRA) and fold-change.Data in the main text represent the top 20 up-regulated genes for each analysis, while the full analysis (all genes up-and down-regulated) is presented in the supplementary files (S1 File and Supplementary Figs 1-4 in S5 File).
The RRA analysis returned 1283 significant (p<0.05)genes for IFNγ, 1180 genes for LPS, 1319 for LPS+IFNγ and 902 genes for IL-4.As shown in the Venn diagram in Fig 2, a very high proportion of significant genes were unique to the selected stimuli but, as expected, there were a high number of genes shared by the three M1-inducing stimuli (LPS, IFNγ, LPS +IFNγ).As an example, when evaluating the genes up-regulated by IFNγ, 451 genes were solely found significantly and robustly modulated in the IFNγ dataset, in contrast to 121 which were shared with the LPS dataset and 357 which were shared with the LPS+IFNγ dataset.To improve the robustness of the M1 signature over the historical one, we arbitrarily selected those genes that showed a p<10 −7 in each of the three datasets and that were not present in the RRA list for IL-4.The 28 genes resulting from this analysis together with the respective foldchanges are listed in Table 2 and can be viewed as an M1-restricted and consolidated list.We performed a similar analysis using the fold-change as a parameter, setting the cut-off to log2 fold-change > 5. Eleven genes were in common for all three M1 stimuli (Table 3).Few genes, as expected, emerged as shared between the M2-polarizing agent IL-4 and the other three stimuli, both when analyzing RRA or fold-change.
We next proceeded to analyse each stimulus separately.Figs 3A, 4A, 5A and 6A show the most robust gene changes derived from the overall RRA discovery study (S2 File).It can be noticed that when comparing the classical list with the top 20 robust genes, some are shared, while the majority is distinct.

IPA UP-stream comparison analysis
Last, we compared the pathway and functional enrichment results of the M1-and M2-related treatment and observed significant differences in biological functions related to immune cell functions.IPA upstream regulator analysis provides a powerful tool to predict the deregulated functional activities that are possibly affected by the transcriptome data.

Discussion
RNA-Seq has become a standard experimental procedure to delineate the action of polarizing agents on macrophages.This is evident from the high number of datasets we retrieved from public repositories, when searching for those we reckoned are the most common stimuli used in murine research (IFNγ and/or LPS for M1 and IL-4 and/or IL-13 for M2).Indeed, we were able to find a sufficient number of repeated datasets for three out of these four stimuli (as well as for IFNγ+LPS).Given that it is counter-intuitive to repeatedly perform identical    experiments across laboratories without at least attempting to consolidate the information to slimline results and find common trends that reduce experimental noise and chance-findings, we proceeded to meta-analyse the retrieved results.Such effort could yield a smaller, less-time consuming, more robust and cheaper set of genes to investigate polarization as well as reduce animal experiments.It could also yield a set of genes that, when present in RNA-Seq experiments, could be easily associated to a particular stimulus or polarization.Laboratories working on macrophage polarization, when not performing RNA-Seq, pick selected genes which are known in the literature to be associated with either M1 or M2 polarization.Such genes are chosen because of their important inflammatory function associated with polarization, of their perceived solidity regarding changes, of personal experiences, and are also influenced by current trends in the literature.We have attempted, albeit not systematically, to collate the most commonly used genes in Fig 1 .In Tables 4 and 5, we compared this "historical" list of genes associated with M1 or M2 polarization with the findings from our analysis, both in terms of RRA and for fold-change.As shown, a number of traditionally used genes do not appear robust as one would hope for when analysing the single stimuli.
To overcome this arbitrariness, already known in the literature, in 2015 Jablonski et al. [17] used a single microarray dataset to propose a novel list of markers to delineate M1-and M2-signatures.This list was collated by choosing genes that were only up-regulated by M1-or M2-polarizing conditions and genes were ranked mainly on fold-change.As an M1 polarizing-agent, the Authors chose IFNγ+LPS while for M2, the Authors chose IL-4.While there are similarities between the two reports, there are also some clear differences.First, the work by Jablonski was performed using an Affymetrix microarray platform while we used RNAseq experiments.Second, our work expanded the realm of polarization, adding LPS and including also IFNγ alone.Most importantly, though, our work capitalized on multiple datasets to reduce noise and inter-experimental differences and used robust rank aggregation (RRA) as well as fold-change.The two signatures proposed, though, can be compared on the top ranked genes of IFNγ+LPS and IL-4 treatments.Of the top 20 highest fold-change genes reported in A different way to use our analysis is to look at the most robust genes and understand their role and whether they have been object of previous investigations.Focusing solely on the top 20 lists, genes can be categorized in different categories: those for which a solid and consolidated role in macrophage polarization has been previously established (e.g.Gbp2, Fgl2 [26], Nampt [27]), those that have so far not been object of investigation, and given the robustness of the change or the fold-change would deserve so (Clic5, Styk1, Cbib) and those which appear somehow contradictory to common knowledge (for example, Cxcl16 is reported to be a gene that is upregulated both in M1 and M2, but our analysis depicts it as a M1-specific gene).In other words, our work represents a good starting point to fill the gaps of knowledge in macrophage polarization by highlighting genes not yet looked at or genes that require further understanding.Our work also provides a list of 28 genes that are highly significant (p<10 −7 ) in all the three M1 stimuli and could be used as a starting point to define an agreed global M1 signature, when validated with other M1-polarizing agents (e.g.Tnf, Cxcl10).
The present work should be read in light of the following weaknesses.First, we solely looked at murine macrophages and did not investigate whether the same changes are robust in the human counterparts.It is possible and likely that there will be differences between the two species due to a number of reasons, including species-specific genetic variations and transcriptional mechanisms.While a direct comparison between our analyses and the published human transcriptomic analyses (for example [28]) is possible, it would be best to compare lists generated via similar meta-analytical approaches to reduce noise and experimental variations.Second, we opted to analyse gene changes in a restricted time-frame, which is represented by the classical time-points used in experimental settings (between 2-8 hours for M1 polarizing agents and 18-24 hours for M2 agents) in macrophages from healthy mice using the most traditional M1-and M2-stimuli.Yet, (i) it is recognized that gene changes are dynamic; (ii) a number of other polarizing stimuli (for example, IL-10, IL-13 TGF-b, PGE-2) have been reported; and (iii) metabolic perturbations ((e.g.oxygen levels, pH, nutrient scarcity) within the microenvironment in health and disease (e.g.oxygen levels, pH, nutrient scarcity) profoundly affect the skewing and functional state of macrophages.Therefore, our data is a starting point but should be refined looking at those genes of interest from several different perspectives.Third, we solely looked at transcription, and did not investigate whether these changes translate in modifications in protein levels nor at the functional consequences of these changes.Last, this was an unbiased analysis that gave equal weight to the datasets retrieved, over which we had no control for quality, although given the systematic approach that made use of 39 experiments, noise and interexperimental variability were minimized.Overall, the present paper used a meta-analytical approach to provide the genes that are most robustly changed upon macrophage polarization and the genes which appear to be the most up-regulated using fold-change.It is interesting to note that the two outcomes are not superimposable, and that very few genes are present in both lists.This would lead to consider RRA, rather than fold-change, as a better index for consistency across laboratories.The provided lists may be used as reference when investigating murine RNA-Seq experiments performed with other agents as well as a starting point to understand the role of those genes that have emerged and for which little is known at present.

Fig 1 .
Fig 1. Cartoon depicting M1 and M2 characteristics, including the classical genes usually used to determine polarization.https://doi.org/10.1371/journal.pone.0297872.g001 Fig 7 describes the top-10 upstream regulators predicted to be activated or inhibited and the significant enrichment score (the complete analysis is present in the S3 File).For IFNγ stimulation (Fig 7A), we observed an activation of regulators (MYD88, IFNG and TLR4) associated to the Macrophage Classical Activation Signalling Pathway (Z score = 3.92 S4 File), as for LPS (Fig 7B).Moreover, LPS also activated regulators (TNF, MYD88, and IFNG) associated to Pathogen Induced Cytokine Storm Signalling Pathway (Z score = 6.04,S4 File).These activated pathways were also observed in the LPS+ IFNγ stimulation (Z score 3.91 and 5.21 respectively), whose regulators were TICAM1, TNF, TLR4 and NOD2.For IL-4, we found a significant enrichment in Macrophage Alternative Activation Signalling Pathway (Z score = 5.02, S4 File), whose regulators were IL-4, TREM2 and STAT6.

Fig 3 .Fig 4 .Fig 5 .
Fig 3. IFNγ up-regulated DEGs identified by RRA and fold-change analysis.(A) On the left, the heatmap of the five datasets showing the top 20 most robust genes upregulated is depicted.Value in the boxes represents fold-change and shade of blue represents RRA score.On the right, the corresponding average RRA score and fold-change.(B) On the left, the heatmap of the five datasets showing the top 20 most upregulated genes which also present a significant RRA score.Value in the boxes represents fold-change and shade of blue represents RRA score.On the right, the corresponding average RRA score and fold-change.https://doi.org/10.1371/journal.pone.0297872.g003

Fig 6 .
Fig 6.IL-4 up-regulated DEGs identified by RRA and fold-change analysis.(A) On the left, the heatmap of the seventeen datasets showing the top 20 most robust genes upregulated is depicted.Value in the boxes represents foldchange and shade of blue represents RRA score.On the right, the corresponding average RRA score and fold-change.(B) On the left, the heatmap of the seventeen datasets showing the top 20 most upregulated genes which also present a significant RRA score.Value in the boxes represents fold-change and shade of blue represents RRA score.On the right, the corresponding average RRA score and fold-change.

Table 1 . Publicly available murine datasets used in the present study.
BMDMs: bone marrow-derived macrophages.PECs: peritoneal exudate cells.

Table 4 . Comparison between the M1-"historical" murine genes with our analyses.
Numbers in parenthesis highlight the rank from our analysis.https://doi.org/10.1371/journal.pone.0297872.t004the present manuscript for IFNγ+LPS, 4 genes are in common with the top 17 highest foldchange genes from Jablonski.Of the top 20 highest fold-change genes reported in the present manuscript for IL-4, 5 genes are in common with the top 19 highest fold-change genes from Jablonski.Such concordance is high, in our view, also in consideration of the arbitrary cut-off of highest fold-change chosen (i.e. a gene not present in the top 20 might well be significantly regulated and be ranked below) and of the use of different technological platforms.Indeed, of the published list of 17 genes for IFNγ+LPS, only 5 genes, independently of ranking, did not result statistically significant in our RRA analysis (Cxcl3, Ptges, Cd200, Ascl1, Ppap2a).Of the published list for IL-4, only 6 did not result statistically significant in our RRA analysis (Ear1,